We consider a two-letter self-avoiding (square) lattice heteropolymer model of $N_H$ (out of $N$)
attracting sites. At zero temperature, permanent links are formed, leading to collapsed structures for
any fraction $\rho_H = N_H/N$. The average chain size scales as $R \simeq N^{1/d} F(\rho_H)$ ($d$ is the
space dimension). As $\rho_H \to 0$, $F(\rho_H) \sim \rho_H^{\zeta}$ with $\zeta = 1/d - \nu = -1/4$ for
$d = 2$. Moreover, for $0 < \rho_H < 1$, the entropy approaches zero as $N \to \infty$ (being finite for a
homopolymer). An abrupt decrease in entropy occurs at the phase boundary between the swollen
($R \sim N^{\nu}$) and collapsed regions. Scaling arguments predict different regimes depending on the
ensemble of crosslinks. Some implications for the protein folding problem are discussed.



PACS. 05.70Fh, 61.25Hq, 87.15By 



The role of crosslinks in polymers has relevant applications for many kinds of systems, such as
proteins, DNA and other copolymers. Recently, it has been shown that random crosslinking of residues
imposes stringent constraints on protein folding kinetics. Assuming that there is one "correct" set
of crosslinks resembling the native structure, it is found that the time needed for fast folding
sequences to reach this state scales as $N^{\lambda}$, where $N$ is the number of monomers (or
residues) in the chain and $\lambda \approx 3$ ($\lambda \approx 4$ at the onset). The model suggests
that the size of the critical nucleus is on the order of the system size. Polymer crosslinking is
also important for structure determination using NMR. This technique determines a limited number of
contacts in, say, proteins. Hence, one would like to understand how crosslinks constrain the possible
conformations. Finally, we mention the process of vulcanization, where concentrated solutions of
crosslinked polymers become amorphous. These materials undergo a thermodynamic phase transition to a
frozen phase if the number of crosslinks exceeds some critical value.

For these reasons it is desirable to understand the role of internal constraints in polymers. Based
on mean-field or ideal (random walk) polymer models, recent attempts to address this problem have
given conflicting suggestions. Gutin and Shakhnovich found that the conformational entropy smoothly
decreases as the number of crosslinks increases. These authors have hinted that these conclusions may
depend on the ensemble of links. Bryngelson and Thirumalai found a threshold density of links,
scaling as $1/\ln N$, beyond which polymers collapse. On the other hand, it has been found [6] that
the typical size $R$ of a random walk in $d = 2$ dimensions is reduced by $M$ links to
$R \approx (N/M)^{\nu_0}$, where $\nu_0 = 1/2$ is the random walk exponent. Kantor and Kardar [6a]
conjectured that for a self-avoiding polymer $R > (N/M)^{\nu}$, where $\nu$ is the standard
correlation length exponent for self-avoiding walks (SAW). Accordingly, polymers collapse if the
number of crosslinks scales as $M \sim N^{\phi}$, with $\phi > 1 - 1/d\nu$.
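
The threshold exponent quoted above follows from equating the conjectured size $(N/M)^{\nu}$ with the
compact size $N^{1/d}$. The sketch below only reproduces that arithmetic. It assumes the SAW exponents
used in this letter ($\nu = 3/4$ for $d = 2$ and $\nu \approx 0.592$ for $d = 3$), and the function
name is purely illustrative.

```python
def collapse_threshold_exponent(d, nu):
    """Return phi = 1 - 1/(d*nu).

    Obtained by setting (N/M)**nu ~ N**(1/d) with M ~ N**phi,
    i.e. nu * (1 - phi) = 1/d.
    """
    return 1.0 - 1.0 / (d * nu)


# SAW exponents as quoted in the text (assumed inputs, not computed here).
phi_2d = collapse_threshold_exponent(2, 0.75)
phi_3d = collapse_threshold_exponent(3, 0.592)

print(f"d = 2: phi = {phi_2d:.4f}")
print(f"d = 3: phi = {phi_3d:.4f}")
```

Both values are below one, which is why this conjecture implies collapse with a sublinear number of
links.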



In this letter, we move beyond mean field to show that when links form freely among a random set of
sites (annealed case), polymers do not collapse to a compact state unless the number of constraints
scales linearly in $N$. We should point out that in all likelihood the annealed case is a better
model for real polymers. To reach this conclusion we analyze the whole sequence space of a two-letter
heteropolymer model with $N_H$ "hydrophobic" attracting sites and $N_P$ "hydrophilic" (or "polar")
sites. The polymer chain is represented by a SAW of $N = N_H + N_P$ sites on the square lattice with
spacing $a$. If two H sites are nearest neighbors, a short range attractive energy is assumed, adding
$-\epsilon < 0$ to the conformational energy of the chain. The only interaction, besides the
aforementioned attraction between H sites, is self-avoidance, which forbids two monomers from
occupying the same site. It should be mentioned that this model has been extensively used to study
protein folding, where a limited number of hydrophobic sites are believed to play a dominant role in
the folding process. At zero temperature, H sites form permanent links. Using exact series
enumeration [10] of all possible crosslinked conformations, we obtain exact thermodynamic quantities
for $N < 20$. Analytically, we consider Flory's affine network theory of rubber elasticity to
generalize some of our conclusions to the problem of quenched random links. In what follows, we work
in dimensionless units with $a, \epsilon = 1$.
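
The model just described can be illustrated by brute-force enumeration for very short chains. The
following is a minimal sketch and not the enumeration code used for the results in this letter. It
assumes that only H sites that are lattice neighbors but not consecutive along the chain contribute
$-\epsilon$ (with $\epsilon = 1$), and it counts symmetry-related walks separately. All function names
are hypothetical.

```python
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def self_avoiding_walks(n_sites):
    """Yield every SAW with n_sites sites on the square lattice, starting at the origin."""
    def extend(path, occupied):
        if len(path) == n_sites:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in STEPS:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:  # self-avoidance
                path.append(nxt)
                occupied.add(nxt)
                yield from extend(path, occupied)
                occupied.remove(nxt)
                path.pop()

    start = (0, 0)
    yield from extend([start], {start})


def hh_contacts(path, sequence):
    """Count H-H nearest-neighbor pairs that are not adjacent along the chain."""
    index = {site: i for i, site in enumerate(path)}
    count = 0
    for i, (x, y) in enumerate(path):
        if sequence[i] != "H":
            continue
        # Looking only right and up counts each lattice bond once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index.get((x + dx, y + dy))
            if j is not None and abs(i - j) > 1 and sequence[j] == "H":
                count += 1
    return count


def ground_state(sequence):
    """Return (ground-state energy, number of walks attaining it) for epsilon = 1."""
    best_contacts, degeneracy = -1, 0
    for path in self_avoiding_walks(len(sequence)):
        contacts = hh_contacts(path, sequence)
        if contacts > best_contacts:
            best_contacts, degeneracy = contacts, 1
        elif contacts == best_contacts:
            degeneracy += 1
    return -best_contacts, degeneracy


print(ground_state("HPPH"))
```

For the four-site sequence HPPH the ground state has energy -1 and is reached by the 8 U-shaped
walks (one shape together with its rotations and reflections). The number of walks grows
exponentially with $N$, so this direct approach is only practical for short chains.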

It is well known that at some critical "theta" temperature $T_\theta$ homopolymers undergo a
coil-to-globule (or collapse) transition. Below $T_\theta$ polymers collapse to an average radius of
gyration $\langle R_g \rangle$ scaling as $N^{1/d}$. Above $T_\theta$ polymers are swollen (or
extended) with $\langle R_g \rangle \sim N^{\nu}$, where $\nu = 3/4$ and 0.592 for $d = 2$ and 3
respectively. To analyze heteropolymers we proceed by computing the radius of gyration
$\overline{\langle R_g \rangle}$, where the upper bar means average over sequence space, i.e.
$\binom{N}{N_H}$ sequences for any given $N$ and $N_H$. Although the data shown in this letter
correspond to $d = 2$, it is helpful theoretically to keep the symbols $d$ and $\nu$ in evidence.
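
As a small illustration of the two ingredients named above, the sketch below computes the squared
radius of gyration of a single lattice conformation and the size of the sequence space for given $N$
and $N_H$. It is only a sketch under the assumption of unit lattice spacing and equal-mass sites. The
function name and the example conformations are illustrative.

```python
from math import comb


def squared_radius_of_gyration(path):
    """Mean squared distance of the sites from their center of mass."""
    n = len(path)
    cx = sum(x for x, _ in path) / n
    cy = sum(y for _, y in path) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in path) / n


rod = [(i, 0) for i in range(4)]           # fully extended 4-site chain
square = [(0, 0), (1, 0), (1, 1), (0, 1)]  # compact 4-site chain

print(squared_radius_of_gyration(rod), squared_radius_of_gyration(square))

# Number of H/P sequences with N_H hydrophobic sites out of N.
print(comb(20, 10))
```

The binomial count shows why the sequence average becomes expensive near $\rho_H \approx 0.5$, where
the number of sequences is largest.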

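As shown in Fig. 1, we find that, at $T = 0$, the average chain size $R$ (the sequence-averaged
radius of gyration) is very well described by the scaling law

    $R \simeq N^{1/d} F(\rho_H)$, with $\rho_H = N_H/N$,    (1)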



where $\rho_H = 1 - \rho_P$ corresponds to the fraction of H sites. We note that in general one
should have allowed the scaling variable $\rho$ to depend on a suitable crossover exponent, say,
$\rho_H = N_H/N^{\phi}$. Our results, however, indicate that $\phi = 1$. Eq. 1 demonstrates that
polymers collapse if and only if $N_H$ scales as $N$. This result is clearly not obvious. Namely,
hydrophilic chains with a tiny, but finite, fraction of randomly distributed attracting sites are
collapsed at $T = 0$.

Furthermore, the scaling function $F(\rho_H)$ is expected to have well defined asymptotic laws in
both the hydrophilic limit $\rho_H \to 0$ and the hydrophobic limit $\rho_P \to 0$. For
$\rho_H \to 0$, one should recover the self-avoiding walk exponent (as a function of $N$). Hence,

    $F(\rho_H) \simeq A \rho_H^{\zeta}$, with $\zeta = 1/d - \nu$.    (2)



This is in excellent agreement with the slope $\zeta = -1/4$ observed in Fig. 1. At $T = 0$, the
chain ensemble corresponds to that of maximally crosslinked chains. Indeed, we find
$\langle M \rangle \sim N_H$, suggesting the validity of (1) with a scaling variable $\rho = M/N$ for
$M$ annealed random links. Based on (2), we predict $\zeta \approx -0.259$ for $d = 3$. It is
noteworthy that if one fixes $N_H$ and lets $N \to \infty$, then the transition between SAW behavior
and the collapse regime occurs at $\rho_H^* = 1 - \rho_P^* \approx 0.61$ (see Fig. 1).
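
The values of $\zeta$ quoted above can be checked directly from $\zeta = 1/d - \nu$. The sketch
below also verifies the consistency requirement behind Eq. (2): at fixed $N_H$, the size
$N^{1/d} \rho_H^{\zeta}$ must grow as $N^{\nu}$. The exponents $\nu$ are taken from the text, and the
function name is illustrative.

```python
def zeta(d, nu):
    """Hydrophilic-limit exponent of the scaling function, zeta = 1/d - nu."""
    return 1.0 / d - nu


for d, nu in ((2, 0.75), (3, 0.592)):
    z = zeta(d, nu)
    # With rho_H = N_H / N, the size N**(1/d) * rho_H**z scales as N**(1/d - z)
    # at fixed N_H, which should reproduce the SAW exponent nu.
    n_exponent = 1.0 / d - z
    print(f"d = {d}: zeta = {z:.4f}, exponent of N at fixed N_H = {n_exponent:.4f}")
```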

The hydrophobic limit is shown in the inset of Fig. 1. For $\rho_P \to 0$, chains collapse,
approaching a sphere of volume $V \approx N a^d + O(N^{\sigma/d})$, where $a^d$ is the volume of a
lattice cell. Naively, we might expect $\sigma$ to be a surface correction, i.e. $\sigma = d - 1$.
The data, however, show



    $R^2 / N^{2/d} \simeq R_0^2 + B \rho_P^{\sigma}$,    (3)

where $R_0 = 1/\sqrt{2\pi}$, and $\sigma = 0.7 \pm 0.1$ ($d = 2$) is a universal exponent describing
the approach to circularity of a collapsing chain. Interestingly, (3) is related to the longstanding
problem of how many lattice points fit inside a sphere of volume $V$, where $\sigma$ is known to vary
between 1/2 and (upper bound) $7/11 < 1$ [13]. The slope in Fig. 1 (inset) corresponds to the scaled
version of this exponent. For $d = 3$, $R_0^2 = (3a^3/4\pi)^{2/3}\, 3/5$ and $\sigma < 2$ [13]. The
apparent deviations from scaling at small $N_P$ are well-known finite-size effects on the shape of
collapsed lattice chains.
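
The lattice-point problem mentioned above is easy to explore numerically in $d = 2$. The sketch
below counts the integer points inside a disc and compares the count with the disc area. It also
evaluates $1/(2\pi)$, the value of $R_g^2/N$ for a uniform disc of area $N$ (with $a = 1$). This is
an illustration only, not the analysis behind Fig. 1, and the function name is hypothetical.

```python
from math import isqrt, pi


def lattice_points_in_disc(radius_sq):
    """Count integer points (x, y) with x**2 + y**2 <= radius_sq."""
    r = isqrt(radius_sq)
    total = 0
    for x in range(-r, r + 1):
        # For this column, y ranges over -ymax..ymax.
        total += 2 * isqrt(radius_sq - x * x) + 1
    return total


for radius in (5, 50, 500):
    count = lattice_points_in_disc(radius * radius)
    area = pi * radius * radius
    print(f"r = {radius}: points = {count}, area = {area:.1f}, difference = {count - area:.1f}")

# Uniform disc of area N: radius**2 = N/pi and R_g**2 = radius**2 / 2, so R_g**2 / N = 1/(2*pi).
r0_sq = 1.0 / (2.0 * pi)
print(f"R_0^2 = {r0_sq:.5f}")
```

The difference between the count and the area is the quantity whose growth with radius defines the
exponent discussed in the text.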

A similar analysis of the conformational entropy $s_0(\rho_H) = \ln \Omega(\rho_H)/N$, where
$\Omega$ is the number of conformations, leads to the scaling plot shown in Fig. 2. In the
hydrophilic region $\rho_H \lesssim 0.6$, $s_0(\rho_H) \simeq G(\rho_H)/N^{x}$, with
$x = 0.43 \pm 0.04$. For $\rho_H \gtrsim 0.6$, entropy decreases even faster with $N$. In the SAW
limit $\rho_H \to 0$, $s_0$ approaches a constant, yielding $G(\rho_H) \sim \rho_H^{-x}$. Scaling
breaks down due to finite-size effects at $\rho_P \sim O(N^{-1/d})$. At this point, hydrophilic sites
rearrange on the surface of the structure and entropy approaches a constant,
$s_0(\rho_H = 1) = \ln(q/e)$, where $q$ is the coordination number of the lattice. Hence, in sharp
contrast with the homopolymer cases $\rho_H = 0$ and $\rho_P = 0$, chains have zero entropy and are
collapsed for any finite $\rho_H < 1$ and
$N \to \infty$. Indeed, the internal network of permanent links formed for
$N_H \sim O(N) > N^{1 - 1/d\nu}$ yields enough constraints to change the qualitative properties of
polymers. We should mention that the entropy of maximally compact structures in heteropolymers has
already been shown to have a deep minimum around $\rho_H \approx 0.6$.

As a function of temperature, starting from the homopolymer (theta) case, we increase $\rho_P$,
reducing the overall drive towards collapse. Then, as indicated in the phase diagram of Fig. 3, the
collapse transition temperature $T_x(\rho_P)$, which divides the swollen from the collapsed region,
goes down, and eventually reaches zero at $\rho_P = 1$. Kantor and Kardar have shown a related phase
diagram for random "charges" in a $d = 3$ chain. In a regime where both H and P sites attract each
other, they found that chains collapse for $|N_H - N_P|/N$ between 0 and 1. Note, however, that their
model cannot sample the hydrophilic regime with less than 50% of attracting sites.

Entropy is $s(\rho_H, T) = (E - F)/TN$, where $E$ and $F$ are the energy and free energy,
respectively. Upon crossing $T_x(\rho_P)$, heteropolymer chains have a sharper decrease in entropy
than homopolymers (see inset in Fig. 2). This sharpness appears to be higher than what would be
expected for a critical transition. Moreover, the remnant entropy below $T_x(\rho_P)$ decreases with
system size, whereas above $T_x(\rho_P)$ it remains constant. All these suggest that the nature of
the transition changes from critical to first order at some tricritical point $T_t(\rho_P)$; from our
limited data, we conjecture $0 < \rho_P < 0.4$ (see Fig. 3). Furthermore, one could also argue that,
given the first-order jump in entropy at $T = 0$ and $\rho_P = 1$ ($N \to \infty$), by continuity it
is reasonable to expect a line of first-order transitions beginning at $T = 0$ and going to finite
temperatures. Similar phase diagrams have been obtained for, say, the tricritical point in a dilute
magnet. The analogy here is diluting a "theta" polymer. It is also worth mentioning that a closely
related model solved by Garel et al. shows that the nature of the collapse transition depends on the
hydrophobic-hydrophilic content of the chain. In the hydrophilic regime, the transition is
first-order, whereas in the strongly hydrophobic regime the transition is continuous, similar to an
ordinary "theta" point. An impressive, and yet intriguing, result is that even if the collapse
transition is continuous, the low temperature phase has no entropy (except at $\rho_P = 0, 1$).
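
The relation $s = (E - F)/TN$ and the energy fluctuations used to locate the transition can both be
evaluated once the number of conformations at each energy is known. The sketch below assumes
$k_B = 1$ and takes the density of states as a plain dictionary. The example input, 8 walks at
energy -1 and 28 at energy 0, is the four-site sequence HPPH on the square lattice with
symmetry-related walks counted separately, and it is meant as a toy case only. The function name is
hypothetical.

```python
from math import exp, log


def thermodynamics(density_of_states, temperature, n_sites):
    """Return (entropy per site, energy fluctuation <E^2> - <E>^2).

    density_of_states maps each energy to its number of conformations.
    """
    e_min = min(density_of_states)
    # Shift energies by e_min so the Boltzmann factors stay bounded at low T.
    weights = {e: g * exp(-(e - e_min) / temperature)
               for e, g in density_of_states.items()}
    z = sum(weights.values())
    mean_e = sum(e * w for e, w in weights.items()) / z
    mean_e2 = sum(e * e * w for e, w in weights.items()) / z
    free_energy = e_min - temperature * log(z)
    entropy = (mean_e - free_energy) / (temperature * n_sites)
    return entropy, mean_e2 - mean_e ** 2


dos = {-1: 8, 0: 28}  # toy density of states for HPPH (assumed example)
for t in (0.05, 0.3, 1.0, 1000.0):
    s, fluct = thermodynamics(dos, t, 4)
    print(f"T = {t}: s = {s:.4f}, energy fluctuation = {fluct:.4f}")
```

At low temperature the entropy per site tends to $\ln 8 / 4$ and at high temperature to
$\ln 36 / 4$, the logarithms of the ground-state and total numbers of walks.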

In the hydrophilic regime $\rho_H \lesssim \rho_H^*$ ($T = 0$), the average number of conformations
$\Omega(\rho_H)$ grows exponentially in $N$, changing to non-exponential growth only at
$\rho_H^* \approx 0.61 \pm 0.05$. The change in the scaling of $\Omega(\rho_H)$ is rather abrupt.
Below $\rho_H^*$, $\ln \Omega(\rho_H)/N$ increases as $\sim (\rho_H^* - \rho_H)^{w}$ with
$w \approx 0.5 \pm 0.1$. Above $\rho_H^*$, $\ln \Omega(\rho_H)/N$ consistently decreases towards
zero. This means that the structural localization in some few structures is particularly strong at
and below $\rho_P^*$; see also Fig. 2. It is tempting to speculate that this special point
$\rho_H^*$ is related to a rigidity percolation transition (see, e.g., [17]) from a rigid to a floppy
structure, or to a "vulcanization" transition of a single chain. Certainly, these aspects of the
model deserve further study.

For completeness, we address the question: "What happens for crosslinks that can form arbitrarily
far apart along the backbone (quenched case)?" A general analysis of crosslinks in polymers can be
made by means of Flory's affine network theory of rubber elasticity. In this framework, the total
free energy of a polymer of $N$ sites and $M_4$ crosslinks of functionality four can be constructed
(see, e.g., [10]). By considering elastic, repulsive and entropic energy terms, plus an ideal gas as
solvent, the typical size of a crosslinked polymer is found to be [13]

    $R \simeq \rho^{-1/(d+2)} N^{2/(d+2)}$, with $\rho = M_4/N$.    (4)



This expression is expected to be valid for $\rho \ll 1$. Strikingly, as $N \to \infty$ at fixed
$M_4$, we automatically recover Flory's exponent for a SAW with $R \sim N^{3/(d+2)}$. Moreover, for
$d = 2$ the theory predicts the same scaling form and exponents as in Eqs. 1 and 2, with
$R \sim \rho^{-1/4} N^{1/2}$.

For $d = 3$, a new scaling behavior is predicted, namely $R \sim \rho^{-1/5} N^{2/5}$. For a given
ratio $\rho$, conformations are neither fully collapsed nor swollen [11]. This behavior has also been
implied by Levin and Barbosa in a study of phase transitions of neutral polyampholytes. The
prediction is that polymers collapse if $M_4 \sim N^{\phi}$, with $\phi = 4/3$. However, if we also
allow sites with functionality larger than four, then we get back to $\phi = 1$ as in (1). Hence, it
is much harder to collapse a chain with two-particle links than with links that do not saturate.
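
The exponents quoted for the quenched case follow from Eq. (4) by elementary exponent arithmetic. The
sketch below uses exact fractions to recover the fixed-$M_4$ Flory exponent $3/(d+2)$ and the
collapse threshold $\phi$, defined by requiring $R \sim N^{1/d}$ when $M_4 \sim N^{\phi}$. The
function names are illustrative.

```python
from fractions import Fraction


def flory_exponents(d):
    """Exponents (a, b) in R ~ rho**a * N**b, as given by Eq. (4)."""
    return Fraction(-1, d + 2), Fraction(2, d + 2)


def collapse_link_exponent(d):
    """phi such that M ~ N**phi gives the compact scaling R ~ N**(1/d)."""
    a, b = flory_exponents(d)
    # With rho = M/N: R ~ M**a * N**(b - a). Setting M ~ N**phi and
    # R ~ N**(1/d) gives a*phi + b - a = 1/d.
    return (Fraction(1, d) - b + a) / a


for d in (2, 3):
    a, b = flory_exponents(d)
    print(f"d = {d}: R ~ rho^({a}) N^({b}), fixed-M exponent = {b - a}, "
          f"phi = {collapse_link_exponent(d)}")
```

For $d = 2$ this gives $\phi = 1$, matching the annealed result, while for $d = 3$ it gives
$\phi = 4/3$.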

In summary, the theoretical and numerical study of heteropolymers and randomly crosslinked chains
has revealed a variety of different regimes, some with simple scaling behavior, others more complex
and intriguing. For $d = 2$, we find that polymers collapse if and only if the number of mutually
attracting monomers $N_H = N - N_P$ scales as the size of the system $N$. At zero temperature, novel
universal exponents describe the limiting behavior for $\rho_H = N_H/N \to 0$ and 1. We expect the
same conclusion to be valid for $d = 3$. For $d = 2$, the problems of annealed and quenched random
links are predicted to have the same scaling properties. For $d = 3$, quenched (two-particle) links
collapse a chain if $M \sim N^{4/3}$. The nature of the collapse transition changes from first to
second order for some small enough density of non-interacting sites $\rho_P = N_P/N$. We note that
almost simultaneously with collapse there is an abrupt decrease in entropy. Tracing this entropic
change to a folding transition suggests a favorable scenario to find fast folding proteins [19]. In
the thermodynamic limit ($T = 0$) random HP chains have zero entropy for any finite fraction
$\rho_H < 1$. This entropic crisis and collapse is due to the network of internal constraints buried
in the structure. This result has yet to be fully understood in the context of random heteropolymers
with quenched disorder. Properties of the size, entropy, and number of conformations as a function of
$\rho_H$ indicate the existence of a special point at $\rho_H^* \approx 0.61$, most likely related to
a rigidity percolation or vulcanization transition. Our results show that both structural
determination and collapse require a relatively large number of constraints ($\sim N$), suggesting
that the thermodynamics and dynamics of crosslinking should play an important role in protein
folding.

FIG. 1. Scaling of the squared radius of gyration averaged over sequence space as a function of the
fraction of hydrophobic (attracting) sites $\rho_H = N_H/N$, and of (inset) hydrophilic
(non-interacting) sites $\rho_P = N_P/N$. Points with $N_H = 0$ and 1 are not shown as they
correspond to unrestricted SAW. Symbols correspond to exact data for the square lattice. The limits
for $\rho_H$ and $\rho_P \to 0$ are indicated by the dashed lines, i.e. $2\zeta = -1/2$ and
$A = .132 \pm .001$ (2), and (inset) $\sigma = 0.7$ and $B = .0364 \pm .003$ (3), respectively. (For
$N > 18$, some data points with $\rho_H \approx 0.5$ are missing due to CPU constraints.)

FIG. 2. Entropy $s_0(\rho_H)$ as a function of the scaling variable $\rho_H$, at $T = 0$. Same
symbols as in Fig. 1. Inset: entropy $s(T)$ as a function of temperature for $N = 15$ and $N_P = 0$
(dashed line) and $N_P > 0$ (solid lines). Values of $N_P$ are indicated in the figure; see also the
dotted lines in Fig. 3. The collapse temperature $T_x(\rho_P)$ is indicated by the square symbols.
The curves for $N_P = 7$ and 10 are indicative of a sharper transition than the $N_P = 0$ case.

FIG. 3. Schematic phase diagram of a heteropolymer. Square symbols show the exact position of the
peak in the energy fluctuations $\Delta E = \langle E^2 \rangle - \langle E \rangle^2$ for $N = 15$
and $N_P = 0, \ldots, 13$. Our data suggest the possibility of a first-order collapse transition
(solid line) for hydrophilic chains changing to second-order (dashed line) for hydrophobic chains at
some point $T_t(\rho_P)$, see text. A solid circle indicates the position of $\rho_P^*$. Curves in
Fig. 2 are computed along the dotted lines.